54 research outputs found

    Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

    Many real-world manipulation tasks consist of a series of subtasks that are significantly different from one another. Such long-horizon, complex tasks highlight the potential of dexterous hands, which are adaptable and versatile and can transition seamlessly between different modes of functionality without re-grasping or external tools. However, challenges arise from the high-dimensional action space of dexterous hands and the complex compositional dynamics of long-horizon tasks. We present Sequential Dexterity, a general system based on reinforcement learning (RL) that chains multiple dexterous policies to achieve long-horizon task goals. The core of the system is a transition feasibility function that progressively finetunes the sub-policies to improve the chaining success rate, while also enabling autonomous policy-switching to recover from failures and bypass redundant stages. Despite being trained only in simulation with a few task objects, our system generalizes to novel object shapes and transfers zero-shot to a real-world robot equipped with a dexterous hand. More details and video results can be found at https://sequential-dexterity.github.io
    Comment: CoRL 202

    Spiking PointNet: Spiking Neural Networks for Point Clouds

    Recently, Spiking Neural Networks (SNNs), which are extremely energy efficient, have drawn much research attention in 2D visual recognition and shown steadily increasing application potential. However, whether SNNs can be generalized to 3D recognition remains underexplored. To this end, we present Spiking PointNet, the first spiking neural model for efficient deep learning on point clouds. We find that two major obstacles limit the application of SNNs to point clouds: the intrinsic optimization obstacle of SNNs, which impedes the training of a big spiking model with large time steps, and the expensive memory and computation cost of PointNet, which makes training a big spiking point model unrealistic. To solve both problems simultaneously, we present a trained-less but learning-more paradigm for Spiking PointNet with theoretical justifications and in-depth experimental analysis. Specifically, our Spiking PointNet is trained with only a single time step but obtains better performance with multiple-time-step inference than a model trained directly with multiple time steps. We conduct various experiments on ModelNet10 and ModelNet40 to demonstrate the effectiveness of Spiking PointNet. Notably, our Spiking PointNet can even outperform its ANN counterpart, which is rare in the SNN field, thus providing a potential research direction for future work. Moreover, Spiking PointNet shows impressive speedup and storage saving in the training phase.
    Comment: Accepted by NeurIP

    Learning a Universal Human Prior for Dexterous Manipulation from Human Preference

    Generating human-like behavior on robots is a great challenge, especially in dexterous manipulation tasks with robotic hands. Even in simulation with no sample constraints, scripting controllers is intractable due to the high degrees of freedom, and manual reward engineering can also be hard and lead to non-realistic motions. Leveraging recent progress in Reinforcement Learning from Human Feedback (RLHF), we propose a framework to learn a universal human prior using direct human preference feedback over videos, for efficiently tuning the RL policy on 20 dual-hand robot manipulation tasks in simulation, without a single human demonstration. One task-agnostic reward model is trained by iteratively generating diverse policies and collecting human preferences over the trajectories; it is then applied to regularize the behavior of policies in the fine-tuning stage. Our method empirically demonstrates more human-like behaviors on robot hands in diverse tasks, including unseen tasks, indicating its generalization capability.

    Dynamic Handover: Throw and Catch with Bimanual Hands

    Humans throw and catch objects all the time. However, such a seemingly common skill poses many challenges for robots: they need to perform such dynamic actions at high speed, collaborate precisely, and interact with diverse objects. In this paper, we design a system with two multi-finger hands attached to robot arms to solve this problem. We train our system using Multi-Agent Reinforcement Learning in simulation and perform Sim2Real transfer to deploy it on real robots. To overcome the Sim2Real gap, we provide multiple novel algorithm designs, including learning a trajectory prediction model for the object. Such a model helps the robot catcher estimate in real time where the object is heading and react accordingly. We conduct our experiments with multiple objects in the real-world system and show significant improvements over multiple baselines. Our project page is available at https://binghao-huang.github.io/dynamic_handover/
    Comment: Accepted at CoRL 2023. https://binghao-huang.github.io/dynamic_handover

    Membrane Potential Batch Normalization for Spiking Neural Networks

    As one of the energy-efficient alternatives to conventional neural networks (CNNs), spiking neural networks (SNNs) have attracted growing interest recently. To train deep models, several effective batch normalization (BN) techniques have been proposed for SNNs. All of these BNs are intended to be used after the convolution layer, as is usually done in CNNs. However, the spiking neuron is much more complex because of its spatio-temporal dynamics. The regulated data flow after the BN layer is disturbed again by the membrane potential updating operation before the firing function, i.e., the nonlinear activation. Therefore, we advocate adding another BN layer before the firing function to normalize the membrane potential again, called MPBN. To eliminate the time cost induced by MPBN, we also propose a training-inference-decoupled re-parameterization technique that folds the trained MPBN into the firing threshold. With this re-parameterization technique, MPBN introduces no extra time burden at inference. Furthermore, MPBN can also adopt the element-wise form, while the BNs after the convolution layer can only use the channel-wise form. Experimental results show that the proposed MPBN performs well on both popular non-spiking static and neuromorphic datasets. Our code is open-sourced at https://github.com/yfguo91/MPBN
    Comment: Accepted by ICCV202
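
    The idea of folding a normalization layer into the firing threshold can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard affine BN form, a single scalar neuron, a positive scale `gamma`, and a simple "fire if the value reaches the threshold" rule. All function and parameter names here are hypothetical.

```python
import math


def fold_bn_into_threshold(v_th, gamma, beta, mean, var, eps=1e-5):
    """Return a threshold on the raw membrane potential u that is
    equivalent to applying BN to u and then comparing with v_th.

    Assumption: BN(u) = gamma * (u - mean) / sqrt(var + eps) + beta
    with gamma > 0, so BN(u) >= v_th can be solved for u directly.
    """
    if gamma <= 0:
        raise ValueError("this sketch only handles gamma > 0")
    return (v_th - beta) * math.sqrt(var + eps) / gamma + mean


def fire_with_bn(u, v_th, gamma, beta, mean, var, eps=1e-5):
    """Training-time view: normalize the potential, then threshold it."""
    normalized = gamma * (u - mean) / math.sqrt(var + eps) + beta
    return 1 if normalized >= v_th else 0


def fire_folded(u, folded_v_th):
    """Inference-time view: compare the raw potential with the folded threshold."""
    return 1 if u >= folded_v_th else 0


# Hypothetical BN statistics and parameters for one neuron.
params = dict(v_th=1.0, gamma=0.5, beta=0.1, mean=0.2, var=0.04)
folded = fold_bn_into_threshold(**params)
potentials = [-1.0, 0.0, 0.5, 0.56, 0.57, 1.0, 2.0]
spikes_bn = [fire_with_bn(u, **params) for u in potentials]
spikes_folded = [fire_folded(u, folded) for u in potentials]
```

    Under these assumptions both paths produce the same spikes, so the normalization costs nothing extra at inference; a negative `gamma` would flip the inequality and needs separate handling.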

    RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks

    Spiking Neural Networks (SNNs), as one of the biology-inspired models, have received much attention recently. They can significantly reduce energy consumption because they quantize the real-valued membrane potentials to 0/1 spikes to transmit information, so the multiplications of activations and weights can be replaced by additions when implemented on hardware. However, this quantization mechanism inevitably introduces quantization error, causing catastrophic information loss. To address the quantization error problem, we propose a regularizing membrane potential loss (RMP-Loss) that adjusts the membrane potential distribution, which is directly related to the quantization error, to a range close to the spikes. Our method is extremely simple to implement and makes training an SNN straightforward. Furthermore, it is shown to consistently outperform previous state-of-the-art methods across different network architectures and datasets.
    Comment: Accepted by ICCV202

    On the Circular Polarisation of Repeating Fast Radio Bursts

    Fast-spinning (e.g., sub-second) neutron stars with ultra-strong magnetic fields (so-called magnetars) are one of the promising origins of repeating fast radio bursts (FRBs). Here we discuss circularly polarised emissions produced by propagation effects in the magnetosphere of fast-spinning magnetars. We argue that the polarisation-limiting region is well beyond the light cylinder, suggesting that wave mode coupling effects are unlikely to produce strong circular polarisation for fast-spinning magnetars. Cyclotron absorption could be significant if the secondary plasma density is high. However, high degrees of circular polarisation can only be produced with large asymmetries in electrons and positrons. We draw attention to the non-detection of circular polarisation in current observations of known repeating FRBs. We suggest that the circular polarisation of FRBs could provide key information on their origins and help distinguish different radiation mechanisms.
    Comment: ApJ accepte

    Clinical comparison of percutaneous transforaminal endoscopic discectomy and unilateral biportal endoscopic discectomy for single-level lumbar disc herniation

    Purpose: To compare the clinical outcomes of percutaneous transforaminal endoscopic discectomy (PTED) and unilateral biportal endoscopic discectomy (UBE) for the treatment of single-level lumbar disc herniation (LDH). Materials and methods: From January 2020 to November 2021, 62 patients with single-level LDH were retrospectively reviewed. All patients underwent spinal surgeries at the Affiliated Hospital of Chengde Medical University and Beijing Tongren Hospital, Capital Medical University. Among them, 30 patients were treated with UBE, and 32 were treated with PTED. The patients were followed up for at least one year. Patient demographics and perioperative outcomes were reviewed before and after surgery. The Oswestry Disability Index (ODI), visual analog scale (VAS) for back pain and leg pain, and modified MacNab criteria were used to evaluate the clinical outcomes. X-ray examinations were performed one year after surgery to assess the stability of the lumbar spine. Results: The mean ages in the UBE and PTED groups were 46.7 years and 48.0 years, respectively. Compared to the UBE group, the PTED group had better VAS scores for back pain at 1 and 7 days after surgery (3.06 ± 0.80 vs. 4.03 ± 0.81, P < 0.05; 2.81 ± 0.60 vs. 3.70 ± 0.79, P < 0.05). The UBE and PTED groups demonstrated significant improvements in the VAS score for leg pain and ODI score, and no significant differences were found between the groups at any time after the first month (P > 0.05). Although the good-to-excellent rate of the modified MacNab criteria in the UBE group was similar to that in the PTED group (86.7% vs. 87.5%, P > 0.05), PTED was advantageous in terms of the operation time, estimated blood loss, incision length, and length of postoperative hospital stay. Conclusions: Both UBE and PTED have favorable outcomes in patients with single-level LDH. However, PTED is superior to UBE in terms of short-term postoperative back pain relief and perioperative quality of life.